Chaos management during a major incident

No software system today is fully failure-resistant, so software teams need to be able to respond to major production incidents nimbly. But responding to a major outage is itself a painful operational exercise, one that can require multiple stakeholders to work together. In this talk, Aish discusses how to deal efficiently with the human element when complex systems fail.

Video

Slides
